In document capture, a paper document is scanned, and the document may contain confidential information such as credit card numbers or taxpayer identification numbers. While such data can be extracted automatically using optical character recognition (OCR), OCR is not always accurate, and a human operator may need to validate the extracted information against what appears on the paper. If the operator has access to the document image, the confidential information may be exposed unless it is redacted.
Redaction requires additional processing and/or human work, and is prone to errors and omissions. For example, information may appear in an unexpected place, such as handwritten in a margin, and/or information that should be protected from disclosure may appear in multiple places in a document and not be redacted in all of them.